architectural style
ArchiLense: A Framework for Quantitative Analysis of Architectural Styles Based on Vision Large Language Models
Zhong, Jing, Yin, Jun, Li, Peilin, Zeng, Pengyu, Zang, Miao, Luo, Ran, Lu, Shuai
Architectural cultures across regions are characterized by stylistic diversity, shaped by historical, social, and technological contexts in addition to geograph-ical conditions. Understanding architectural styles requires the ability to describe and analyze the stylistic features of different architects from various regions through visual observations of architectural imagery. However, traditional studies of architectural culture have largely relied on subjective expert interpretations and historical literature reviews, often suffering from regional biases and limited ex-planatory scope. To address these challenges, this study proposes three core contributions: (1) We construct a professional architectural style dataset named ArchDiffBench, which comprises 1,765 high-quality architectural images and their corresponding style annotations, collected from different regions and historical periods. (2) We propose ArchiLense, an analytical framework grounded in Vision-Language Models and constructed using the ArchDiffBench dataset. By integrating ad-vanced computer vision techniques, deep learning, and machine learning algo-rithms, ArchiLense enables automatic recognition, comparison, and precise classi-fication of architectural imagery, producing descriptive language outputs that ar-ticulate stylistic differences. (3) Extensive evaluations show that ArchiLense achieves strong performance in architectural style recognition, with a 92.4% con-sistency rate with expert annotations and 84.5% classification accuracy, effec-tively capturing stylistic distinctions across images. The proposed approach transcends the subjectivity inherent in traditional analyses and offers a more objective and accurate perspective for comparative studies of architectural culture.
Don't be Fooled: The Misinformation Effect of Explanations in Human-AI Collaboration
Spitzer, Philipp, Holstein, Joshua, Morrison, Katelyn, Holstein, Kenneth, Satzger, Gerhard, Kühl, Niklas
Across various applications, humans increasingly use black-box artificial intelligence (AI) systems without insight into these systems' reasoning. To counter this opacity, explainable AI (XAI) methods promise enhanced transparency and interpretability. While recent studies have explored how XAI affects human-AI collaboration, few have examined the potential pitfalls caused by incorrect explanations. The implications for humans can be far-reaching but have not been explored extensively. To investigate this, we ran a study (n=160) on AI-assisted decision-making in which humans were supported by XAI. Our findings reveal a misinformation effect when incorrect explanations accompany correct AI advice with implications post-collaboration. This effect causes humans to infer flawed reasoning strategies, hindering task execution and demonstrating impaired procedural knowledge. Additionally, incorrect explanations compromise human-AI team-performance during collaboration. With our work, we contribute to HCI by providing empirical evidence for the negative consequences of incorrect explanations on humans post-collaboration and outlining guidelines for designers of AI.
The Impact of Feature Embedding Placement in the Ansatz of a Quantum Kernel in QSVMs
Salmenperä, Ilmo, Kuhtarskis, Ilmars, van de Griend, Arianne Meijer, Nurminen, Jukka K.
Designing a useful feature map for a quantum kernel is a critical task when attempting to achieve an advantage over classical machine learning models. The choice of circuit architecture, i.e. how feature-dependent gates should be interwoven with other gates is a relatively unexplored problem and becomes very important when using a model of quantum kernels called Quantum Embedding Kernels (QEK). We study and categorize various architectural patterns in QEKs and show that existing architectural styles do not behave as the literature supposes. We also produce a novel alternative architecture based on the old ones and show that it performs equally well while containing fewer gates than its older counterparts.
Zero-shot Building Age Classification from Facade Image Using GPT-4
Zeng, Zichao, Goo, June Moh, Wang, Xinglei, Chi, Bin, Wang, Meihui, Boehm, Jan
A building's age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images using deep learning. However, building an accurate deep learning model requires a considerable amount of labelled training data, and the trained models often have geographical constraints. Recently, large pre-trained vision language models (VLMs) such as GPT-4 Vision, which demonstrate significant generalisation capabilities, have emerged as potential training-free tools for dealing with specific vision tasks, but their applicability and reliability for building information remain unexplored. In this study, a zero-shot building age classifier for facade images is developed using prompts that include logical instructions. Taking London as a test case, we introduce a new dataset, FI-London, comprising facade images and building age epochs. Although the training-free classifier achieved a modest accuracy of 39.69%, the mean absolute error of 0.85 decades indicates that the model can predict building age epochs successfully albeit with a small bias. The ensuing discussion reveals that the classifier struggles to predict the age of very old buildings and is challenged by fine-grained predictions within 2 decades. Overall, the classifier utilising GPT-4 Vision is capable of predicting the rough age epoch of a building from a single facade image without any training.
SCAPE: Searching Conceptual Architecture Prompts using Evolution
Lim, Soo Ling, Bentley, Peter J, Ishikawa, Fuyuki
Conceptual architecture involves a highly creative exploration of novel ideas, often taken from other disciplines as architects consider radical new forms, materials, textures and colors for buildings. While today's generative AI systems can produce remarkable results, they lack the creativity demonstrated for decades by evolutionary algorithms. SCAPE, our proposed tool, combines evolutionary search with generative AI, enabling users to explore creative and good quality designs inspired by their initial input through a simple point and click interface. SCAPE injects randomness into generative AI, and enables memory, making use of the built-in language skills of GPT-4 to vary prompts via text-based mutation and crossover. We demonstrate that compared to DALL-E 3, SCAPE enables a 67% improvement in image novelty, plus improvements in quality and effectiveness of use; we show that in just 3 iterations SCAPE has a 24% image novelty increase enabling effective exploration, plus optimization of images by users. We use more than 20 independent architects to assess SCAPE, who provide markedly positive feedback.
Generative AI and the History of Architecture
Ploennigs, Joern, Berger, Markus
Recent generative AI platforms are able to create texts or impressive images from simple text prompts. This makes them powerful tools for summarizing knowledge about architectural history or deriving new creative work in early design tasks like ideation, sketching and modelling. But, how good is the understanding of the generative AI models of the history of architecture? Has it learned to properly distinguish styles, or is it hallucinating information? In this chapter, we investigate this question for generative AI platforms for text and image generation for different architectural styles, to understand the capabilities and boundaries of knowledge of those tools. We also analyze how they are already being used by analyzing a data set of 101 million Midjourney queries to see if and how practitioners are already querying for specific architectural concepts.
ArchiGuesser -- AI Art Architecture Educational Game
Ploennigs, Joern, Berger, Markus, Carnein, Eva
The use of generative AI in education is a controversial topic. Current technology offers the potential to create educational content from text, speech, to images based on simple input prompts. This can enhance productivity by summarizing knowledge and improving communication, quickly adjusting to different types of learners. Moreover, generative AI holds the promise of making the learning itself more fun, by responding to user inputs and dynamically generating high-quality creative material. In this paper we present the multisensory educational game ArchiGuesser that combines various AI technologies from large language models, image generation, to computer vision to serve a single purpose: Teaching students in a playful way the diversity of our architectural history and how generative AI works.
Machine Learning Advances aiding Recognition and Classification of Indian Monuments and Landmarks
Paul, Aditya Jyoti, Ghose, Smaranjit, Aggarwal, Kanishka, Nethaji, Niketha, Pal, Shivam, Purkayastha, Arnab Dutta
Tourism in India plays a quintessential role in the country's economy with an estimated 9.2% GDP share for the year 2018. With a yearly growth rate of 6.2%, the industry holds a huge potential for being the primary driver of the economy as observed in the nations of the Middle East like the United Arab Emirates. The historical and cultural diversity exhibited throughout the geography of the nation is a unique spectacle for people around the world and therefore serves to attract tourists in tens of millions in number every year. Traditionally, tour guides or academic professionals who study these heritage monuments were responsible for providing information to the visitors regarding their architectural and historical significance. However, unfortunately this system has several caveats when considered on a large scale such as unavailability of sufficient trained people, lack of accurate information, failure to convey the richness of details in an attractive format etc. Recently, machine learning approaches revolving around the usage of monument pictures have been shown to be useful for rudimentary analysis of heritage sights. This paper serves as a survey of the research endeavors undertaken in this direction which would eventually provide insights for building an automated decision system that could be utilized to make the experience of tourism in India more modernized for visitors.
Traveling tourist Part 1: Import WikiData to Neo4j with Neosemantics library
After a short summer break, I have prepared a new blog series. In this first part, we will construct a knowledge graph of monuments located in Spain. As you might know, I have lately gained a lot of interest and respect for the wealth of knowledge that is available through the WikiData API. We will continue honing our SPARQL syntax knowledge and fetch the information regarding the monuments located in Spain from the WikiData API. I wasn't aware of this before, but scraping the RDF data available online and importing it into Neo4j is such a popular topic that Dr. Jesus Barrasa developed a Neosemantics library to help us with this process.